<!DOCTYPE html>
<!-- Generated by pkgdown: do not edit by hand --><html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="utf-8"><meta http-equiv="X-UA-Compatible" content="IE=edge"><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"><title>Utility to load datasets from AWS DMAP Data Commons, into memory — dataload_from_aws • EJAM</title><!-- favicons --><link rel="icon" type="image/png" sizes="16x16" href="../favicon-16x16.png"><link rel="icon" type="image/png" sizes="32x32" href="../favicon-32x32.png"><link rel="apple-touch-icon" type="image/png" sizes="180x180" href="../apple-touch-icon.png"><link rel="apple-touch-icon" type="image/png" sizes="120x120" href="../apple-touch-icon-120x120.png"><link rel="apple-touch-icon" type="image/png" sizes="76x76" href="../apple-touch-icon-76x76.png"><link rel="apple-touch-icon" type="image/png" sizes="60x60" href="../apple-touch-icon-60x60.png"><script src="../deps/jquery-3.6.0/jquery-3.6.0.min.js"></script><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"><link href="../deps/bootstrap-5.3.1/bootstrap.min.css" rel="stylesheet"><script src="../deps/bootstrap-5.3.1/bootstrap.bundle.min.js"></script><link href="../deps/font-awesome-6.4.2/css/all.min.css" rel="stylesheet"><link href="../deps/font-awesome-6.4.2/css/v4-shims.min.css" rel="stylesheet"><script src="../deps/headroom-0.11.0/headroom.min.js"></script><script src="../deps/headroom-0.11.0/jQuery.headroom.min.js"></script><script src="../deps/bootstrap-toc-1.0.1/bootstrap-toc.min.js"></script><script src="../deps/clipboard.js-2.0.11/clipboard.min.js"></script><script src="../deps/search-1.0.0/autocomplete.jquery.min.js"></script><script src="../deps/search-1.0.0/fuse.min.js"></script><script src="../deps/search-1.0.0/mark.min.js"></script><!-- pkgdown --><script src="../pkgdown.js"></script><meta property="og:title" content="Utility to load datasets from AWS DMAP Data 
Commons, into memory — dataload_from_aws"><meta name="description" content="Utility to load datasets from AWS DMAP Data Commons, into memory"><meta property="og:description" content="Utility to load datasets from AWS DMAP Data Commons, into memory"><meta property="og:image" content="https://usepa.github.io/EJAM/logo.svg"></head><body>
    <a href="#main" class="visually-hidden-focusable">Skip to contents</a>


    <nav class="navbar navbar-expand-lg fixed-top bg-light" data-bs-theme="light" aria-label="Site navigation"><div class="container">

    <a class="navbar-brand me-2" href="../index.html">EJAM</a>

    <small class="nav-text text-warning me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="Released version">2.32.0</small>


    <button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">
      <span class="navbar-toggler-icon"></span>
    </button>

    <div id="navbar" class="collapse navbar-collapse ms-3">
      <ul class="navbar-nav me-auto"><li class="active nav-item"><a class="nav-link" href="../reference/index.html">Reference</a></li>
<li class="nav-item dropdown">
  <button class="nav-link dropdown-toggle" type="button" id="dropdown-articles" data-bs-toggle="dropdown" aria-expanded="false" aria-haspopup="true">Articles</button>
  <ul class="dropdown-menu" aria-labelledby="dropdown-articles"><li><hr class="dropdown-divider"></li>
    <li><h6 class="dropdown-header" data-toc-skip>Overview for EJAM Users</h6></li>
    <li><a class="dropdown-item" href="../articles/0_whatis.html">What is EJAM</a></li>
    <li><a class="dropdown-item" href="../articles/0_webapp.html">Using EJAM</a></li>
    <li><hr class="dropdown-divider"></li>
    <li><h6 class="dropdown-header" data-toc-skip>For analysts using R</h6></li>
    <li><a class="dropdown-item" href="../articles/1_installing.html">Installing the EJAM R package</a></li>
    <li><a class="dropdown-item" href="../articles/2_quickstart.html">Quick Start Guide</a></li>
    <li><a class="dropdown-item" href="../articles/3_analyzing.html">Basics of Using EJAM for Analysis in RStudio</a></li>
    <li><a class="dropdown-item" href="../articles/4_advanced.html">Advanced Features</a></li>
  </ul></li>
<li class="nav-item"><a class="nav-link" href="../news/index.html">Changelog</a></li>
      </ul><ul class="navbar-nav"><li class="nav-item"><form class="form-inline" role="search">
 <input class="form-control" type="search" name="search-input" id="search-input" autocomplete="off" aria-label="Search site" placeholder="Search for" data-search-index="../search.json"></form></li>
      </ul></div>


  </div>
</nav><div class="container template-reference-topic">
<div class="row">
  <main id="main" class="col-md-9"><div class="page-header">
      <img src="../logo.svg" class="logo" alt=""><h1>Utility to load datasets from AWS DMAP Data Commons, into memory</h1>
      <small class="dont-index">Source: <a href="https://github.com/USEPA/EJAM/blob/HEAD/R/dataload_from_aws.R" class="external-link"><code>R/dataload_from_aws.R</code></a></small>
      <div class="d-none name"><code>dataload_from_aws.Rd</code></div>
    </div>

    <div class="ref-description section level2">
    <p>Utility that loads datasets from the AWS DMAP Data Commons into memory.</p>
    </div>

    <div class="section level2">
    <h2 id="ref-usage">Usage<a class="anchor" aria-label="anchor" href="#ref-usage"></a></h2>
    <div class="sourceCode"><pre class="sourceCode r"><code><span><span class="fu">dataload_from_aws</span><span class="op">(</span></span>
<span>  varnames <span class="op">=</span> <span class="va">.arrow_ds_names</span><span class="op">[</span><span class="fl">1</span><span class="op">:</span><span class="fl">3</span><span class="op">]</span>,</span>
<span>  ext <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="st">".arrow"</span>, <span class="st">".rda"</span><span class="op">)</span><span class="op">[</span><span class="fl">2</span><span class="op">]</span>,</span>
<span>  fun <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="st">"arrow::read_ipc_file"</span>, <span class="st">"load"</span><span class="op">)</span><span class="op">[</span><span class="fl">2</span><span class="op">]</span>,</span>
<span>  envir <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/environment.html" class="external-link">globalenv</a></span><span class="op">(</span><span class="op">)</span>,</span>
<span>  mybucket <span class="op">=</span> <span class="st">"dmap-data-commons-oa"</span>,</span>
<span>  mybucketfolder <span class="op">=</span> <span class="st">"EJAM"</span>,</span>
<span>  folder_local_source <span class="op">=</span> <span class="st">"./data/"</span>,</span>
<span>  justchecking <span class="op">=</span> <span class="cn">FALSE</span>,</span>
<span>  check_server_even_if_justchecking <span class="op">=</span> <span class="cn">TRUE</span>,</span>
<span>  testing <span class="op">=</span> <span class="cn">FALSE</span></span>
<span><span class="op">)</span></span></code></pre></div>
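    <p>The <code>ext</code> and <code>fun</code> defaults are written as an index into a short vector of candidates, so the second element of each is what is used when the argument is omitted. The sketch below only evaluates those expressions in base R. The guarded call at the end needs the EJAM package and network access, so it runs only in an interactive session.</p>
    <div class="sourceCode"><pre class="sourceCode r"><code># The defaults pick element [2] of each candidate vector
ext_default &lt;- c(".arrow", ".rda")[2]
fun_default &lt;- c("arrow::read_ipc_file", "load")[2]
ext_default  # ".rda"
fun_default  # "load"

# Only check sizes and availability; needs EJAM installed and a connection
if (interactive() &amp;&amp; requireNamespace("EJAM", quietly = TRUE)) {
  EJAM:::dataload_from_aws(justchecking = TRUE)
}</code></pre></div>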
    </div>

    <div class="section level2">
    <h2 id="arguments">Arguments<a class="anchor" aria-label="anchor" href="#arguments"></a></h2>


<dl><dt id="arg-varnames">varnames<a class="anchor" aria-label="anchor" href="#arg-varnames"></a></dt>
<dd><p>Character vector of the quoted names of the data objects to load, like "blockwts" or "quaddata".</p></dd>


<dt id="arg-ext">ext<a class="anchor" aria-label="anchor" href="#arg-ext"></a></dt>
<dd><p>File extension of the files to read, like ".arrow" or ".rda".</p></dd>


<dt id="arg-fun">fun<a class="anchor" aria-label="anchor" href="#arg-fun"></a></dt>
<dd><p>Name of the function to use when reading, like "arrow::read_ipc_file" or "load".</p></dd>


<dt id="arg-envir">envir<a class="anchor" aria-label="anchor" href="#arg-envir"></a></dt>
<dd><p>Environment to load the data objects into, e.g., globalenv() or parent.frame().</p></dd>


<dt id="arg-mybucket">mybucket<a class="anchor" aria-label="anchor" href="#arg-mybucket"></a></dt>
<dd><p>Name of the AWS bucket that holds the files. The default is "dmap-data-commons-oa".</p></dd>


<dt id="arg-mybucketfolder">mybucketfolder<a class="anchor" aria-label="anchor" href="#arg-mybucketfolder"></a></dt>
<dd><p>Folder within the AWS bucket, like "EJAM".</p></dd>


<dt id="arg-folder-local-source">folder_local_source<a class="anchor" aria-label="anchor" href="#arg-folder-local-source"></a></dt>
<dd><p>Path of a folder (not ending in a forward slash) to
look in for locally saved copies during development,
to avoid waiting for a download from a server.</p></dd>


<dt id="arg-justchecking">justchecking<a class="anchor" aria-label="anchor" href="#arg-justchecking"></a></dt>
<dd><p>Set to TRUE to get the object size (and confirm the file exists and is accessible).</p></dd>


<dt id="arg-check-server-even-if-justchecking">check_server_even_if_justchecking<a class="anchor" aria-label="anchor" href="#arg-check-server-even-if-justchecking"></a></dt>
<dd><p>Set this to FALSE to stop checking the server to see if the files are there
when justchecking = TRUE. The server is always checked if justchecking = FALSE.</p></dd>


<dt id="arg-testing">testing<a class="anchor" aria-label="anchor" href="#arg-testing"></a></dt>
<dd><p>only for testing</p></dd>

</dl></div>
    <div class="section level2">
    <h2 id="value">Value<a class="anchor" aria-label="anchor" href="#value"></a></h2>
    <p>Nothing. It just loads the data into the specified environment (unless justchecking = TRUE).</p>
    </div>
    <div class="section level2">
    <h2 id="details">Details<a class="anchor" aria-label="anchor" href="#details"></a></h2>
    <p>See source code for details.</p>
<p>Tries dataload_from_local() first
(at least during development) to avoid slow downloads.</p>
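<p>The local-first idea can be sketched in base R as follows. This is only an illustration and not the EJAM implementation: <code>load_local_first()</code>, its <code>download_fun</code> argument, and the demo object are hypothetical names, and the example assumes the local copy is an <code>.rda</code> file named after the object.</p>
<div class="sourceCode"><pre class="sourceCode r"><code># Hypothetical helper (not part of EJAM): use a local .rda copy if one exists,
# otherwise fall back to a download function supplied by the caller.
load_local_first &lt;- function(varname, folder, envir = globalenv(),
                             download_fun = NULL) {
  localpath &lt;- file.path(folder, paste0(varname, ".rda"))
  if (file.exists(localpath)) {
    load(localpath, envir = envir)
    return(invisible("local"))
  }
  if (is.null(download_fun)) {
    stop("No local copy of ", varname, " and no download function supplied")
  }
  download_fun(varname, envir)
  invisible("remote")
}

# Demonstration with a small stand-in object saved to a temporary folder
demo_folder &lt;- tempfile("localdata")
dir.create(demo_folder)
bgid2fips_demo &lt;- data.frame(bgid = 1:3)
save(bgid2fips_demo, file = file.path(demo_folder, "bgid2fips_demo.rda"))
rm(bgid2fips_demo)

demo_env &lt;- new.env()
src &lt;- load_local_first("bgid2fips_demo", demo_folder, envir = demo_env)
src                                                       # "local"
exists("bgid2fips_demo", envir = demo_env, inherits = FALSE)  # TRUE</code></pre></div>
<p>Loading into a dedicated environment, as in the demonstration, keeps the stand-in object out of the global environment. Passing <code>envir = globalenv()</code> would instead mirror the default shown in Usage.</p>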
<p>Also see <a href="https://shiny.posit.co/r/articles/improve/scoping/" class="external-link">https://shiny.posit.co/r/articles/improve/scoping/</a></p>
<p>These files are public-facing – no credentials required.</p>
<p>To get information about the datasets, use <code>EJAM:::dataload_from_aws(justchecking = TRUE)</code>, <code>EJAM:::datapack("EJAM")</code>, <code>tables()</code>, or <code>object.size(quaddata)</code>.</p>
<p>blockid2fips was used only in state_from_blockid(), which is no longer used by testpoints_n(),
so it is not loaded unless and until it is needed.
This avoids loading the huge file "blockid2fips" (100MB). Instead, "bgid2fips" (3MB) is used as needed, and it is only about 3% as large in memory.
blockid2fips was roughly 600 MB in RAM because it stores 8 million block FIPS as text.</p>
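<p>A rough way to see why identifiers stored as text take so much RAM is to compare <code>object.size()</code> for a character vector and an integer vector of the same length. The identifiers below are made up for illustration (15 characters each, like a block FIPS code) and are not taken from the real blockid2fips table, so the sizes only indicate the general pattern.</p>
<div class="sourceCode"><pre class="sourceCode r"><code>n &lt;- 100000L

# Made-up 15-character identifiers standing in for block FIPS codes
fips_text &lt;- paste0("01001", sprintf("%010d", seq_len(n)))

# The same number of rows identified by integer ids instead
block_ids &lt;- seq_len(n)

size_text &lt;- as.numeric(object.size(fips_text))
size_ids &lt;- as.numeric(object.size(block_ids))

format(object.size(fips_text), units = "MB")
format(object.size(block_ids), units = "MB")

# How many times larger the text version is in memory
size_text / size_ids</code></pre></div>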
<p>Files may include the following:</p><ul><li><p>frs               (150 MB .arrow file, approx 700 MB RAM)</p></li>
<li><p>frs_by_programid  (approx 500 MB RAM)</p></li>
<li><p>frs_by_sic        (approx  63 MB RAM)</p></li>
<li><p>frs_by_naics      (approx  60 MB RAM)</p></li>
<li><p>frs_by_mact</p></li>
<li><p>quaddata     (168 MB on disk, 229 MB RAM)</p></li>
<li><p>blockid2fips ( 20 MB on disk, 621 MB RAM!) No longer needed.</p></li>
<li><p>blockpoints  ( 86 MB on disk, 164 MB RAM)</p></li>
<li><p>blockwts     ( 31 MB on disk, 196 MB RAM)</p></li>
<li><p>bgej         (123 MB RAM)</p></li>
<li><p>bgid2fips    ( 18 MB RAM)</p></li>
</ul></div>
    <div class="section level2">
    <h2 id="see-also">See also<a class="anchor" aria-label="anchor" href="#see-also"></a></h2>
    <div class="dont-index"><p><code><a href="datapack.html">datapack()</a></code> <code><a href="dataload_from_pins.html">dataload_from_pins()</a></code> <code><a href="dataload_from_local.html">dataload_from_local()</a></code> <code><a href="dataload_from_package.html">dataload_from_package()</a></code> <code><a href="indexblocks.html">indexblocks()</a></code> <code><a href="https://rdrr.io/r/base/ns-hooks.html" class="external-link">.onAttach()</a></code></p></div>
    </div>

  </main><aside class="col-md-3"><nav id="toc" aria-label="Table of contents"><h2>On this page</h2>
    </nav></aside></div>


    <footer><div class="pkgdown-footer-left">
  <p>US EPA 2025</p>
</div>

<div class="pkgdown-footer-right">
  <p>EJAM Version 2.32.0</p>
</div>

    </footer></div>





  </body></html>

